Policy iteration based feedback control
Authors
Abstract
It is well known that stochastic control systems can be viewed as Markov decision processes (MDPs) with continuous state spaces. In this paper, we propose to apply the policy iteration approach in MDPs to the optimal control problem of stochastic systems. We first provide an optimality equation based on performance potentials and develop a policy iteration procedure. Then we apply policy iteration to the jump linear quadratic problem and obtain the coupled Riccati equations for their optimal solutions. The approach is applicable to linear as well as nonlinear systems and can be implemented on-line on real-world systems without identifying all the system structure and parameters. © 2007 Elsevier Ltd. All rights reserved.
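As a rough illustration of the policy iteration procedure mentioned in the abstract, the following sketch runs average-cost policy iteration on a small finite-state MDP, evaluating each policy through its performance potentials. It is not the paper's algorithm: the paper treats continuous state spaces, whereas this is a finite analogue that assumes every policy induces an ergodic chain. The function names (`evaluate`, `policy_iteration`) and the two-state example data are hypothetical.

```python
import numpy as np

def evaluate(P_d, f_d):
    """Solve the Poisson equation (I - P_d) g + eta * 1 = f_d with g[0] = 0.

    Returns the average cost eta and the performance potentials g.
    Assumes the chain P_d is ergodic, so the solution is unique.
    """
    n = P_d.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = np.eye(n) - P_d
    A[:n, n] = 1.0   # coefficient of eta in each state equation
    A[n, 0] = 1.0    # normalization row: g[0] = 0
    b = np.append(f_d, 0.0)
    sol = np.linalg.solve(A, b)
    return sol[n], sol[:n]

def policy_iteration(P, f, max_iter=100):
    """P has shape (actions, states, states); f has shape (states, actions)."""
    n_states = P.shape[1]
    states = np.arange(n_states)
    policy = np.zeros(n_states, dtype=int)
    for _ in range(max_iter):
        # Policy evaluation: potentials of the current policy.
        eta, g = evaluate(P[policy, states], f[states, policy])
        # Policy improvement: minimize f(i, a) + sum_j P(j | i, a) g(j).
        q = f.T + P @ g                      # shape (actions, states)
        best = q.argmin(axis=0)
        # Switch only on strict improvement, which avoids cycling on ties.
        better = q[best, states] < q[policy, states] - 1e-12
        if not better.any():
            break
        policy = np.where(better, best, policy)
    return policy, eta, g

# Hypothetical two-state, two-action example (illustrative numbers only).
P = np.array([[[0.5, 0.5], [0.5, 0.5]],    # transitions under action 0
              [[0.9, 0.1], [0.9, 0.1]]])   # transitions under action 1
f = np.array([[1.0, 1.2],                  # costs in state 0 for actions 0, 1
              [3.0, 3.2]])                 # costs in state 1 for actions 0, 1
policy, eta, g = policy_iteration(P, f)
print(policy, eta, g)
```

In this toy problem action 1 costs slightly more per step but keeps the chain in the cheaper state, so the improvement step selects it in both states.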
Similar resources
Adaptive Optimal Control of Partially-unknown Constrained-input Systems using Policy Iteration with Experience Replay
This paper develops an online learning algorithm to find optimal control solutions for partially-unknown continuous-time systems subject to input constraints. The input constraints are encoded into the optimal control problem through a nonquadratic performance functional. An online policy iteration algorithm that uses integral reinforcement knowledge is developed to learn the solution to the op...
Adaptive optimal control for continuous-time linear systems based on policy iteration
In this paper we propose a new scheme based on adaptive critics for finding online the state feedback, infinite horizon, optimal control solution of linear continuous-time systems using only partial knowledge regarding the system dynamics. In other words, the algorithm solves online an algebraic Riccati equation without knowing the internal dynamics model of the system. Being based on a policy ...
Risk-Sensitive Optimal Control for Markov Decision Processes with Monotone Cost
The existence of an optimal feedback law is established for the risk sensitive optimal control problem with denumerable state space. The main assumptions imposed are irreducibility, and a near monotonicity condition on the one-step cost function. It is found that a solution can be found constructively using either value iteration or policy iteration under suitable conditions on initial feedback...
Risk Sensitive Optimal Control: Existence and Synthesis for Models with Unbounded Cost
The existence of an optimal feedback law is established for the risk sensitive optimal control problem with denumerable state space and unbounded cost. It is found that a solution can be found constructively using value iteration or policy iteration.
Preference-Based Policy Iteration: Leveraging Preference Learning for Reinforcement Learning
This paper makes a first step toward the integration of two subfields of machine learning, namely preference learning and reinforcement learning (RL). An important motivation for a “preference-based” approach to reinforcement learning is a possible extension of the type of feedback an agent may learn from. In particular, while conventional RL methods are essentially confined to deal with numeri...
Journal title:
- Automatica
Volume 44
Publication date: 2008